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Abstract — Minimum mean squared error (MMSE) estimators 
of signals from samples corrupted by jitter (timing noise) and 
additive noise are nonlinear, even when the signal prior and 
additive noise have normal distributions. This paper develops a 
stochastic algorithm based on Gibbs sampling and slice sampling 
to approximate the optimal MMSE estimator in this Bayesian for- 
mulation. Simulations demonstrate that this nonlinear algorithm 
can improve significantly upon the linear MMSE estimator, as 
well as the EM algorithm approximation to the maximum like- 
lihood (ML) estimator used in classical estimation. Effective off- 
chip post-processing to mitigate jitter enables greater jitter to be 
tolerated, potentially reducing on-chip ADC power consumption. 

Index Terms — sampling, timing noise, jitter, analog-to-digital 
conversion, Markov chain Monte Carlo, Gibbs sampling, slice 
sampling 

I. Introduction 

Reducing the power consumption of analog-to-digital con- 
verters (ADCs) would improve the capabilities of power- 
constrained devices like medical implants, wireless sensors, 
and cellular phones. Clock circuits that produce jittered (noisy) 
sample times naturally consume less power than those with 
low phase noise, so allowing high phase noise is one avenue 
to reduce power consumption. However, increasing jitter in 
an ADC reduces the effective number of bits (ENOB) (rms 
accuracy on a dyadic scale) by one for every doubling of 
the jitter standard deviation, as described in fTl and ||2|. 
Compensating for the reduced ENOB by designing more 
accurate comparators increases power consumption by a factor 
of four for every lost bit of accuracy |3|. Thus, to achieve 
reduced on-chip power consumption, the lost bits should be 
recovered in a different manner. 

In |4|, the authors post-process the jittered samples, employ- 
ing an EM algorithm to perform classical maximum likelihood 
(ML) estimation of the signal parameters. This nonlinear clas- 
sical estimation technique is capable of tolerating between 1.4 
and 2 times the jitter standard deviation that can be mitigated 
by linear estimation. In this work, nonlinear post-processing is 
extended to the Bayesian framework, where the signal parame- 
ters are estimated knowing their prior distribution. Here, we do 
not require that signal and noise variances are known a priori; 
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our hierarchical Bayesian model includes prior distributions 
on these parameters. The technique presented here achieves 
significant improvement over linear estimation for a wider 
range of jitter variance than the EM algorithm from H, 
improving the applicability of nonlinear post-processing. 

The block post-processing of the jittered samples is intended 
to be performed off-chip (e.g. on a PC), so we do not attempt 
to optimize the total power consumption, including the digital 
post-processing. However, we are concerned with making 
prudent choices in algorithm design so that the computational 
complexity of the post-processing is reasonable. The problem 
of mitigating jitter also can be motivated by loosening manu- 
facturing tolerances (hence reducing cost) or by problems in 
which spatial locations of sensors are analogous to sampling 
times H- 

A. Problem Formulation 

Consider the shift-invariant subspace of L^(R) associated 
with a generating function li{t) and a signal x{t) in that 
subspace: 

x{t)^Y.''kh{t/T-li). (1) 

kez 

Assuming {h{t/T — fc) : fc G Z} is a Riesz basis for 
the subspace, x{t) is in one-to-one correspondence with the 
sequence {xk}kei and the sequence {xk}ke'L is in P{Ij). 
Examples of /i(i) include the function sinc(t) — ^™l^^ 
used throughout this paper, as well as B-splines and wavelet 
scaling functions as discussed in (|6l. While /i(t) — sinc(i) 
is used for simulations, the developments in this paper are 
not specialized to the form of /i(t) in any way. We only 
require that h{t) satisfies the Riesz basis condition and that the 
sampling prefilter s(— i) satisfies the biorthogonality condition 
{h(t/T - fc), s{t/T - £)) = Sk-i, for all fc, £ £ Z. The Riesz 
basis condition allows bounding of L^ error of x{t) in terms 
of £'^ error of Xk', when {h{t/T — k) : fc e Z} is an orthogonal 
set, these errors are constant multiples. When hit) — sinc(i), 
the shift-invariant subspace is the subspace of signals with 
Nyquist sampling period T. Without loss of generality, we 
assume T = 1. 

When observing the signal x{t) through a sampling system 
like an ADC, the analog signal is prefiltered by s{—t), and 
samples y„ are taken of the result at jittered times t„ = 
nTs + Zn- To model oversampled ADCs, we oversample the 
signal by a factor of M, so the sampling period is Tg = l/M. 
The samples are also corrupted by additive noise Wn, which 
models auxiliary effects like quantization and thermal noise. 
For h{t) = sinc(i), the dual s{t) = sinc(t) is an ideal lowpass 
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Fig. 1. Block diagram of an abstract ADC with off-chip post-processing. The 
signal x{t) is filtered by the sampling prefilter s{—t) and sampled at time 
t„. These samples are corrupted by additive noise Wn to yield y„. The post- 
processor estimates the parameters x of x(t) using the vector of N samples 
y from the ADC. 



filter with bandwidth 27r. The observation model, depicted in 
Figure [T] is 



Vn 



Kt) * s(-i)]t=4V+.„ + 



Wn 



(2) 



We aim to estimate a block of K coefficients, assuming the 
remaining coefficients are negligible: 



K-l 



= (i) W ^Xfe/l(t-fc). 



fe=0 

This specializes the observation model to 

K-l 



Vn 



^ Xkhi 



k=0 



+ Zn 



W„ 



(3) 



(4) 



Grouping the variables into vectors, let x = [xq,. . . , xk-i]'^, 

y = [yo,---,yN^i]'^, z = [zo,...,zn-i]'^, and w = 
[wq, . . . , wn-i]'^ ■ Then, in matrix form. 



H(z)x- 



(5) 



where [H(z)]„,fe ^ h{^ + z„ - fc), forn = {0, . . . , A^ - 1}, 
and fc = {0, . . . , AT- 1}. Let h^(z„) be the nth row of H(z). 
Also, denote the fcth column of H(z) by Hi,(z) and the matrix 
with the remaining K — 1 columns by Hyit(z). Similarly, let 
yi\k = [xq, . . . ,Xk-i,Xk+i, ■ . ■ ..XK-iV be the vector of all 
but the kt\\ signal coefficient. 

In this paper, we assume both the jitter and additive noise 
are random, independent of each other and the signal x{t). 
Specifically, z„ and it;„ are assumed to be iid zero-mean 
Gaussian, with variances equal to a1 and cr^, respectively. In 
keeping with the Bayesian framework, we also choose a prior 
for the signal parameters. For convenience, we use an iid zero- 
mean Gaussian prior with variance a^ because the observation 
model is linear in the parameters. Rather than assuming these 
parameters (variances) to be known, we treat them as random 
variables and assign a conjugate prior to these parameters. 
Thus, a1, CT^, and a^ are inverse Gamma distributed with 
hyperparameters {az,l3z}, {a™,/?^}, and {a,j.,(3^}, respec- 
tively. These hyperparameters may be selected to be consistent 
with in-factory measurement of the noise variances or other 
information. The hierarchical Bayesian model is shown in 
Figure |2] 

To simplify notation, the probability density function (pdf) 
of a is written as p{a), and the pdf of b conditioned on 
a is abbreviated as p(b | a) for random a and p{h; a) 
for nonrandom a. The subscripts usually included outside 
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Fig. 2. Hierarchical Bayesian model of the problem. The observation y„ 
depends on coefficients x'o, . . . ,xk—i and jitter and additive noise Zn and 
Wn- The coefficients all depend on the signal variance a'^, and the jitter and 
additive noise depend on a^ and ct^ , respectively. Each of these vaiiances 
depend on hyperparameters a and /9. In this model, circled nodes are random 
variables, and non-circled nodes are fixed parameters. 



the parentheses will be written only when needed to avoid 
confusion. Expectations will follow the same convention. 

The uniform distribution is written in this paper as t/(set); 
for instance, U{[a,b]) is a uniform distribution over the 
interval [a, b], and U{{u : p{u) > c}) is a uniform distribution 
over the set {u : p{u) > c}. Writing u ^ [/(set) means 
that u is a sample generated from this distribution; analogous 
notation is used for the other distributions in this paper. The 
inverse Gamma distribution has the density function 



Xg(s;a,/3)^^s-"-ie-^/^ 



Via) 
The mean and variance of s are 



P 



a-1 



and var(s) = 



13' 



(a-l)2(a-2)' 



(6) 



(7) 



The density function of the d-dimensional normal distribution 
with mean fi and covariance matrix A is written as 

A/-(a;//,A) ^ \27tA\-'/' exp{-^{si-tif A-\a-fi)}. (8) 

When performing simulations, specific values are required 
for the as and /3's. For the signal variance u^, consider an 
unbiased estimate of that variance from K > I observa- 
tions generated from a standard normal distribution: sk = 
IT^ Y.k^o'i^k - xf, where x^ ^ EfeLo^ ^k is the sample 
mean. Then, we fit the inverse Gamma prior hyperparameters 
ax and /3x to the mean and variance of sk using ^: 



Px 



Solving, 
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K + 1 



var(sK) 



K-l 
(9) 



(10) 
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Similarly for the zero-mean jitter and additive noise variances, 
given A'' > 1 observations and expected noise variances E[(7^] 
and E[al], 

Oiz^oiy,^ — - — , ^^ = — - — E [al] , and /?^ = — - — E 

(11) 
For the examples in this paper, we use the same K and N as 
for our signal; in practical applications, K and N are prior 
observations performed at a factory (for the noise variances) 
or elsewhere (for the signal variance). 

The objective of the algorithm presented in this paper is 
to find the estimator x that minimizes the mean squared 
error (MSB) E [||x(y) — XII2], where the observations y are 
implicitly functions of x. Unlike in the classical estimation 
framework, we have a prior on x, which allows us to formulate 
the minimum mean squared error (MMSE) estimator xmmse 
as the posterior expectation 



Preliminary versions of the algorithms and results presented 
in this work are also discussed, with further background 
material and references, in lil7J . 



Xmmse = E[x | y]. 



(12) 



The posterior distribution p(x | y) depends on the likelihood 
function p(y | x), which can be expressed as in 14J as a 
product of marginal likelihoods: 



PiVn I x) 



AA(2/„;h^(z„)x,a^)AA(z„;0,a2)I^(a2 



(13) 

As neither the likelihood nor posterior distribution has a 
simple closed form, the majority of this paper is devoted to 
approximating these functions using numerical and stochastic 
methods. 

B. Related Work 

Random jitter has been studied extensively throughout the 
early signal processing literature (see Q, JD, and [ID). How- 
ever, much of the effort in designing reconstruction algorithms 
was constrained to linear transformations of the observations. 
These papers also analyze the performance of such algorithms; 
for example, |!9l proves that when the jitter is Gaussian and 
small enough, the MSB is approximately ^il'^a^, where the 

input PSD Sxxij^) = 2n~ ^^ ^^'■' ^^^ '■° ^^^^ °^ attention 
to nonlinear post-processing, it is not readily apparent from the 
literature that these linear estimators are far from optimal. The 
effects of jitter on linear MMSE reconstruction of bandlimited 
signals are discussed in |101 and extended to the asymptotic 
K,N ^ 00 case and multidimensional signals in [11 J. 

More recently, lfT2l uses a second-order Taylor series ap- 
proximation to perform weighted least-squares fitting of a 
jittered random signal. In |13|, two post-processing methods 
are described for the case when the sample times are discrete 
(on a dense grid). Similar to the Gibbs sampler presented 
in this work, lfT4l uses a Metropolis-Hastings Markov chain 
Monte Carlo (MCMC) algorithm to estimate the jitter and 
jitter variance from a sequence of samples. Also, a maximum 
a posteriori (MAP)-based estimator is proposed in ifTSJI to 
mitigate read-in and write-out jitter in data storage devices. 
Finally, a Gibbs sampler is developed in |16| to estimate the 
coefficients and locations of finite rate of innovation signals 
from noisy samples. 



Outline 

In Section HIl numerical quadrature is revisited and Gibbs 
sampling and slice sampling are reviewed. The linear MMSE 
estimator is discussed in Section |lll] In Section IIVI the 
Gibbs sampler approximation to the Bayes MMSE estimator 
is derived, and slice sampling is used in the implementation. 
All these estimators, as well as the EM algorithm from H 
approximating the ML estimator, are analyzed and compared 
via simulations in Section |Vl Conclusions based on these 
simulations, as well as ideas for future research directions, 
are discussed in Section IVll 

II. Background 

In general, the likelihood function in the introduction is 
described in terms of an integration without a closed form. For- 
tunately, numerical methods such as Gauss quadrature, which 
approximates the integration in question with a weighted sum 
of the integrand evaluated at different locations (abscissas), are 
relatiyeW accurate and eff|c^ig;. A more detailed description 
' df X^us^ 'quMclramre &n be fdiihd in the background section 
of |4|, or in |18| or |19|. This paper discusses using Gauss- 
Laguerre quadrature to approximate integration with respect 
to 0-^ and a^^. 

However, simply being able to evaluate (approximately) the 
likelihood function is insufficient to approximate the Bayes 
MMSE estimator To approximate the expectation in ( fT2] i. we 
propose using a Monte Carlo statistical method combining 
Gibbs sampling and slice sampling. Gibbs sampling and slice 
sampling are discussed below. 

A. Numerical Integration 

For integrals of the form J_ f{x)N'{x;^,a'^)dx, tech- 
niques such as Gauss-Legendre and Gauss-Hermite quadra- 
ture, Romberg's method, and Simpson's rule, are described 
in H. Similarly, Gauss-Laguerre quadrature can approximate 
integrals of the form L f{x)x°-e~^ dx. The abscissas and 
weights for Gauss-Laguerre quadrature can be computed using 
the eigenvalue-based method derived in EOl . 

Let Xj and Wj be the abscissas and weights for the Gauss- 
Laguerre quadrature rule of length J. Then, we can integrate 
against the pdf of the inverse Gamma distribution by observ- 
ing, 

/•oo oa 

f{x)ig{x- a, /?) dx = / -il-/(a;)x-("+i)e-^/- dx 

Jo r(a) 

^^f(^A('-X"^e-ydy 

^ f(^\v''-'e-ydv 



r(a) \y 



(14) 



- Gauss-Hermite quad. 
Romberg's method 

- Simpson's rule 

- Gauss-Legendre quad, 
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(a) K = 10, M = 4, E [a^] = O.TS^, E [a^] = 0.1^, Ji 
Ja = 9, J3 = 129, n = 19. 
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Fig. 3. Quadrature approximations are compared to histograms for p{yn \ 
x) for differ ent e xpected values of a^ and a^ (a's and /3's are computed 
according to jilt ). The quadrature approximations are computed for a dense 
grid of 200 values of y„, and the histograms are generated from 100 000 
samples of j/„, computed from samples of tr^, cr^, 2,1, and Wn according 
to 0. The multimodal case |(a)| favors Gauss-Legendre quadrature; case |(b)| 
favors Gauss-Hermite quadrature; the worst-case n is shown in each. The 
legend refers to the quadrature method used for the integral over z„; Gauss- 
Laguerre quadrature is used for the integrals with respect to (T^ and (T^ . 



with respect to crj and cr^- This is combined with Gauss- 
Hermite quadrature with J3 ~ 129 when E[crf] is small 
(< 0.01) and Gauss-Legendre quadrature with J3 = 129 
when K[a'^] > 0.01. This hybrid quadrature also is used when 
computing the expectations in Section |lll] and in the appendix. 

B. Gibbs Sampling 

The Gibbs sampler is a Markov chain Monte Carlo method 
developed in f2T|. Details about the Gibbs sampler and its 
many variants, including Metropolis-within-Gibbs sampling, 
can be found in 1221 . When implementing the Gibbs sampler, 
one must consider both the number of iterations until the 
Markov chain has approximately converged to its stationary 
distribution (the "burn-in time") and the number of samples 
that should be taken after convergence to compute the MMSE 
estimate. According to [231 . separating highly correlated vari- 
ables slows convergence of the Gibbs sampler The number of 
iterations after convergence is connected to both correlation 
between successive samples and the variance of the random 
variables distributed according to the stationary distribution. 

To monitor convergence, heuristics such as the potential 
scale reduction factor (PSRF) and the inter-chain and intra- 
chain variances are developed in ll24l . ||251 . Consider C in- 
stances (chains) of the Gibbs sampler running simultaneously. 
Define the vector Sid to be the combined vector of all the 
samples for the cth chain at the ith iteration. For chain c, the 
average is a.^ — IX^i-^i^c.i- Across all chains, the average 
is a = ^ Sc=i ^c- Then, following the multivariate extension 



to the potential scale reduction factor (PSRF) derived in ||25]| . 
define the intra-chain covariance 



where x'- = fi/xj, and -u;' = Wj/T{a). The substitutions 
X = P/y and dx = /3/y^ dy are made in the second step of 
the derivation. 

UtiUzing a combination of Gauss-Laguerre quadrature and 
either Gauss-Hermite quadrature or Gauss-Legendre quadra- 
ture, we can approximate the likelihood function p(y„ | x) 
using the integral in ( fT3] l. In particular, 

Ji J2 J3 

(15) 
In this equation, the innermost quadrature (over z„) depends 
on the value of cr^, so the values of Zj^ depend on cr^ . Since 
the total number of operations scales exponentially with the 
number of variables being integrated, we seek to minimize 
the choices of Ji, J2, and J3 for this three-dimensional 
summation. To explore the accuracy of this approximation 
as a function of Ji and J2 (we use J3 = 129 from yj), 
the quadrature is performed over a dense grid of values of 
y„ and the results are compared to a histogram generated 
empirically, by fixing x to a randomly chosen vector, gen- 
erating many samples of erf, a'^, Zn, and w„ from their 
respective prior distributions, and computing the samples y„ 
using (|4]l. Comparisons for unimodal and multimodal p(j/„; x) 
are shown in Figure[3] Based on these comparisons, we choose 
Gauss-Laguerre quadrature with Ji = J2 = 9 to integrate 



W,:^ 



1 



(z-l)C 



^^(ac,i -ac)(ac,i -ac)^, (16) 



c=l j=l 



and the inter-chain covariance 



B, 



1 ' 

■,-r A 



ac-a)(ac - a)"^ 



(17) 



The posterior variance V, = ^W, + ^B^, and the PSRF 
^p A i_i _|_ Cii||^-ig^||^^ ^jjgjg II . 1I2 jg jjjg induced 

matrix 2-norm. Then, the Gibbs sampler's Markov chain has 
converged when Rp = 1, and V stabilizes. To measure the 



change in V, we compute ||V 



|l/2 



C. Slice Sampling 

Slice sampling is a Markov chain Monte Carlo method 
described in [26] for generating samples from a distribution by 
instead sampling uniformly from the subgraph of the pdf and 
framing this sampling procedure as a two-stage Gibbs sampler, 
depicted in Figure |4] 

The difficulty of slice sampling is in representing and 
sampling from the slice. In this problem, we show that any 
given slice is bounded, and therefore, an interval containing the 
slice can be constructed, and the "shrinkage" method described 
in ||26l can be used. The shrinkage method is an accept-reject 
method, where given an interval [L, R] containing part (or all) 
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x« ^U{{x -.pix) >2/(*)}) 




Fig. 4. Slice sampling of p{x) illustrated: (a) Sampling is performed by 
traversing a Markov chain to approximate p{x), the stationary distribution. 
Each iteration consists of (b) uniformly choosing a slice {x : p{x) > y} and 
uniformly picking a new sample x from that slice. 



of the slice, a sample is generated uniformly from the interval 
and accepted if the sample is inside the slice. If the sample 
is rejected, the interval shrinks to use the rejected sample 
as a new endpoint. Several variants, including shrinking to 
the midpoint of the interval instead of or in addition to the 
rejected sample, are also described in the rejoinder at the end 
of |26|. These variants are compared in the context of the jitter 
mitigation problem in Section |IV] 

III. Linear Bayesian Estimation 

When block post-processing the samples, the linear 
Bayesian estimator with minimum MSE is called the linear 
MMSE (abbreviated LMMSE) estimator The general form of 
the linear MMSE estimator is given in |27|. For estimating the 
random signal coefficients x using the hierarchical Bayesian 
model in Section H] 



/3x 



1 



Ay = 



PX 



1 



E[H(z)]^, 
E[H(z)H(z) 



1 



(18) 
(19) 



and /Xy 
is 



^x = 0. The LMMSE estimator for random jitter 

Piu{ax - l! 



XLMMSE(y) =E[H(z)]^ E[H(z)H(z) 



I3x{a-u 



which the nonlinear Bayesian estimators derived later are 
measured. The error covariance of this estimator is 



■'^LMMSE|z=0 — <^x 



H(0)^ (h(O)H(O)^ 



[iw[ax - 1) 

Px{aw ~ 1) 

(23) 



H(0) 



IV. Nonlinear Bayesian Estimation 

To improve upon the LMMSE estimator, we expand our 
consideration to nonlinear functions of the data. The Bayes 
MMSE estimator, in its general form in (fT2] i. is the nonlinear 
function that minimizes the MSE. However, since the posterior 
density function for this problem does not have a closed 
form, this estimator can be difficult to compute. Since we are 
interested in the mean of the posterior pdf, finding the Bayes 
MMSE estimator is an obvious application of Monte Carlo 
statistical methods, especially the Gibbs sampler described in 
Section HI] 

We propose using Gibbs sampling to produce a sequence 
of samples for the random parameters we wish to find, via 
traversing a Markov chain to its steady-state distribution, and 
average the samples to approximate the estimator. To this end, 
samples of z, x, al, a^, and a1 are generated according to 
their full conditional distributions (i.e. the distribution of one 
random variable given all the others). To generate samples of 
z, we apply slice sampling. 



A. Generating Zn using Slice Sampling 

Consider generating samples 2„ from the distribution p{- \ 
z\„,x, cr^,CT^,CT^,y), where z\„ is the random vector of 
all the jitter variables except z„. Using Bayes rule and the 
independence of z„ and Wn, 



p{z„ 



2 2 2 \ 



p{y 



Z,X,cr2^CT^,cr2)p(z,cr2)p(x | „2^^p(^^ 



p(z 



\n,X,y, (72^0-2^^2) 
2\ 



OC N{yn\ K (^n)x, <^lW{Zn] 0, <) 

(24) 



(20) 



The expectations in ( |20] | can be computed off-line using Gauss 
quadrature. The error covariance of the LMMSE estimator is 
also derived in jZTl ; for this problem. 



^LMMSE 



Px 



-1 



I-E[H(z)]^ E[H(z)H(z 



Pw{ax 



Slice sampling is used for generating realizations of z„ 

ysince no tightly enveloping proposal density or other tuning is 

necessary; the ability to evaluate an unnormalized form of the 

target distribution is sufficient. Each iteration of slice sampling 

consists of two uniform sampling problems: 

1) Choose a slice u uniformly from 

^ al)], where p(z4*^ 



[0,p(z,(^^ 



fix{aw 
(21) 
When no jitter is assumed, the LMMSE estimator simplifies 



to 



Pwjax - 1) ^ 
Pxia-u 



I) 



XLMMSE|z=o(y) - H(0)^ H(0)H(0) 

l~Jx\'->^w ^ J-; , 

(22) 

This linear estimator is the best linear transformation of the 

data that can be performed in the absence of jitter. Hence, the 

no-jitter LMMSE estimator is the baseline estimator against 



y^, . ^. toV'^)], wherep(zi*^ I y,x,<T2,cr^,cr2)isthe 
~ -* J |unngmfltoeq full conditional density function in ( |24] |. 
~ l/) /Sample Zn / uniformly from the slice S = {z„ : p{zn \ 
y>X,cr^,cr^,cr^) > u}. 
The first step is trivial, since we are sampling from a single 
interval. The second step is more difficult. However, since 
"" < pizn I y, X, 0-2^ (7^, 0-2) for all Zn in the slice. 



logM < 



(y„-h^(z„)x)2 



< 



Zn 

'2a? 



2rr2 
-\og{2Traza^). 



- TT^ -log(27rcr^cr^) 



(25) 



Solving for z„, the range of possible Zn is bounded: 



\zn\ < <7z V^2 log w - 2 \og{2TTa^a^^ 



(26) 



Using these extreme points for the initial interval containing 
the slice, and the "shrinkage" method specified in [26] to 
sample from the slice by repeatedly shrinking the interval, 
slice sampling becomes a relatively efficient method. The 
"shrinkage" method decreases the size of the interval expo- 
nentially fast, on average. To see this, consider one iteration 
of shrinkage, where the initial point xq from the previous step 
of slice sampling lies in the interval [L,R]. This initial point 
is guaranteed to be in the slice by construction. The expected 
size of the new interval [L',R'], from choosing a new point 
x', is 



Algorithm 1 Algorithm for computing z„ with slice sampling. 

Require: Previous value Zn , x, a-^, cr^, u^, y„, threshold 
r >0 

Choose u-C/([0,p(zl'"^^ | x,a2, a^, a^, , y„)]) (see dH). 



Compute initial interval [L, R] according to 
repeat {This is the "shrinkage" algorithm from 1261 .} 
Choose Z'^U{[L,R]). 



rtp{z I x,y„^al,a^^,al 
it z < Zn then 



< u then 



E[R' -L' \R,L,xo] = 



1 



(i? - x') dx' + {x' 

L J xo 



R-L 

R^-2RL + L^ xo{R + L-xo) 



L ^r- Z. 

else 

R^ z. 
end if 
end if 

L)dxii Piz I X, y„,cr2^CT^,cr^) < e^^u then {(Optional) 
ijiidpoint-threshold modification from rejoinder in |26|.} 



then 



2(R-L) R-L 

R-L xq{R + L - Xq) - RL 



R-L 



(27) 



This expectation is quadratic in a;o, so the maximum occurs 
at the extreme point xq — {R + L)/2. The maximum value is 

R-L {{R + L)/2){R + L- 



RL if i(i + i?)<zi*"'' 
L^ \{L + R). 
else 

R^ \{L + R). 
end if 
end if 
until p(z I yi,al,al,al,,yn) > u. 
return z 



niaxE[_R' — L' \ R^ L, xq] 

Xo 



{R + L)/2)-RL 



2 R 

R-L {R + L)74 - RL 



R-L 



(28) 



Concavity implies that the minima are at the two endpoints 
xq — L and xq = R. In both cases, the expected size of the 
interval is {R — L)/2. Therefore, 



-{R-L)< W.[R' -L' \R, L, Xo] < ^{R 



L) 



(29) 



which implies that at worst, the size of the interval shrinks 
to 3/4 its previous size per iteration, on average. Then, given 
the initial interval [Lo, ^o] and previous point Xq, the expected 
size of the interval [Lj, Rj] after / iterations of the shrinkage 
algorithm is 



(R-L). 



convergence speed. In an effort to mitigate the increased 
correlation, a hybrid method is proposed in ll26l that always 
shrinks to the rejected sample, then shrinks to the midpoint of 
the remaining interval only if the probability of the rejected 
sample is sufficiently small (the threshold) relative to the slice. 
These algorithms are applied to both unimodal and multimodal 
posterior distributions p(z„ | x, z,y,(T^, cr^,(T^) in Figure |5] 



E[Ri-Li \Ro,Lo,xo] 



E[E[Ri 
I 



< 



i) (^0 



If the target distribution p{x) is continuous, the algorithm 
is guaranteed to terminate once the search interval is small 
enough. Since the interval size shrinks exponentially fast, on 
average, the number of "shrinkage" iterations is approximately 
proportional to the log of the fraction of the initial interval 
contained in the slice. 

In the rejoinder at the end of |26|, an alternative binary 
search-like midpoint shrinkage algorithm is proposed that 
can converge faster on the slice than the original shrinkage 
algorithm, at the cost of increasing correlation between suc- 
cessive samples, which reduces the overall Gibbs sampler 



To compare these methods in the context of the jitter 
Lj I Rq,Lq, . . . , -R/-1, i/miitigaltibS,ow^(m®nitor the convergence of the complete Gibbs 
sampler, using the shrinkage methods described above. While 
^o)- the combined method obviously shrinks the slice much faster 

than the original method, the hybrid method's increased speed 
must offset any increased correlation in the accepted samples 
in order to be useful. In Figure |6] the convergence metrics 



(30) 



PSRF^/^ and ||V||2^^ are plotted as a function of the total 
number of shrinkage iterations performed. The convergence 
rate of the two shrinkage methods are very similar, but in some 
cases, as shown in Figure |6al the hybrid method outperforms 
the original shrinkage method. 



To summarize, pseudocode of the slice sampling algorithm 
using either shrinkage method to generate realizations of z„ 
is written in Algorithm [T] 
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(a) K = 10, M = 16, (Tz = 0.1, aw = 0.05, n = 80, original (left) and thresholding-based (right) methods. 



Or 

T -50 

N^-100^ 
M-150- 
^ -200 _ 

«, 1 
I 2: 
? 3: 

£ 6: 
■5 7: 




L ^ 2 



L^2 



slice 



./!_ 



l('-l) i? ^ 2 



R^ z 



R^ z 



L * r- 2 
L <— 2 




(b) A" = 10, M = 4, CTz = 0.5, cTin = 0.075, n = 20, original (left) and thresholding-based (right) methods. 

Fig. 5. Comparisons between the original shrinkage method and the modified thresholding-based shrinkage method for unimodal |(a)| and multimodal |(b)| 

I X, (T^, cr^, cr^, j;„) is evaluated for the previous Gibbs sampler iteration's zil and the 
]. The shrinkage methods proceed according to Algorithm [T] The midpoint threshold shown 



(i-l) 



posterior distributions. The unnormalized distribution p{z. 

slice level shown is selected uniformly from [0,p(z„ | 

corresponds to r = 25. Rejected samples are marked with a "x", and the final accepted sample is marked with a "o". In both cases, the thresholding-based 

method reduces the size of the interval more quickly than the original method. Especially in the unimodal case, the accepted sample 2;„ ' is much closer to 

(i — 1) 

the previous iterate z^ than would otherwise be expected from the size of the slice. 



B. Generating x, cr^, a^, and ctJ 

The full conditional distribution on Xk does depend on the 
other signal parameters x^j.: 



and covariance matrix 



a^[H(z)^H(z) 



(34) 



p{xk I x\fe,z,cr^,cr^^cr^,y 



Grouping correlated variables accelerates Gibbs sampler con- 
vergence, and the random vector x can still be generated in 
one simple step since 



_ p(y I z,x, o-^^Jp(z,g^)p(x I q-^)p(q-|j||a^((||) bs sampler easily handles the variances al, a\, or 
p(x\fc, z, cr^, (y'i,-,<y1-,y)^'tii being random variables. The generation of realizations of 

ex TVfv Hfzlx cr^DTVfxfc-O a^^ '^" ^"'^ ^^ proceeds using the previous iteration's estimates 

(31) °^ '^'^^ '-'z^ ^^^ '^w instead of the true variances. Each cycle 
of the Gibbs sampler generates realizations of tr^, al and a'^ 
using the observations y and the current iteration's values of 
z and X. The Gibbs sampler algorithm shown in Algorithmic] 
generates realizations from the posterior pdfs for cr^, a'^, and 
(T^. Using Bayes rule and the independence of z„ and Wn, 
these conditional pdfs are 



p(x I z,cr^,cr„,CT2,y) ocA/'(y;H(z)x,cr„I)A/'(x;0,cr^I) 

(32) 
implies the posterior distribution of x is just multivariate 
normal with mean 



/ 2 \ 22\ /2|\ 

PicTx I X, z, y, cr^ , cr.„) = p{a^ \ x) 



p(x I al)pial) 



fJ-^ 



H(z)^y 



(33) 



Pi^l 



x,z,y,CT^,CT^; 



p(x) 

^ N'{^;0,all)ig{al;ax, (3,), (35) 
p{z I al)p{crl) 



--v{oi\z) 



p(z) 
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■PSRF^'^: hybrid method 
■l|V||"^: hybrid method 
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-PSRF"': original method 
: original method 

-PSRF^'^: hybrid method 
: hybrid method 



Algorithm 2 Pseudocode for the Gibbs sampler modified to 
use slice sampling for the z„'s. 
Require: y, /, /& 

z(0) ^ 0; x(0) ^ XLMMSE|z=o(y) from (|2l; a^^"^ ^ 
a2(°V 0.01; a2^^°V 0.01 
for i = 1 :/ + /(, do 

for n = : iV - 1 do 

Generate z„^ using slice sampling in Algorithm [T| 

end for 

Generate x'*^ from 7\A(/2x, Ax) using ( l33T l and ( |34] |. 



.2W 



Generate cri^^"' from I^(a^,/3^) using 



Generate al^"' from. IQ [a' ^,j3'^) using 
Generate al}^' from I(y(a^,/3(„) using (l40l l. 
end for 
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Fig. 6. One-hundred chains of the Gibbs/slice .sampler are run for 1000 

iterations each, and the convergence as a function of the total number of 

1/2 
shrirLkage iterations is measured by the square root of the PSRF and ||V||2 . 

The values ofa^, Px, Oz, 13 z, ciw, and {3^ are determined for the expected 

values of c^, a'\, and crj, using (TO) and jilt . For a given number of 

shrirLkage iterations, the hybrid rejection-midpoint-threshold method (t = 25) 

outperforms the original rejection-based shrinkage method in |(a)| and performs 

equally well in|(b)| 



+1' 



ly-.(6 + 



^2« 



.2W 



cr; 



return x, z, az, crz, ot 



-.N{^;0,all)ig{al-a,,P,), (36) 



and 



P{^1, I x,z,y,cr^,crf) = _p(cr^ | x,z,y) = 



p(y 



are proper inverse Gamma distributed with the parameters 
described above. 

Once enough samples have been taken so that the current 
state of the Markov chain is sufficiently close to the steady 
state, the Gibbs sampling theory tells us that further samples 
drawn from the chain can be treated as if they were drawn from 
the joint posterior distribution directly. Thus, these additional 
X, z, cr^)p(x)^^pI^eB^()an be averaged to approximate the Bayes MMSE 
ply X z)6stimator In the complete Gibbs sampler in Algorithm |2] /;, 



«AA(y;H(z)x,a^I)ia(a^;a, 



(37) 



The inverse Gamma distribution is the conjugate prior for the 
variance parameter of a Normal distribution (see |28|). There- 
fore, the posterior distribution is also an inverse Gamma distri- 
bution. Specifically, p{a'^ 
where 



x,z,y. 



^l^^^l. 



ig{cjl-a',,P',) 



K 



/?; = /?.+ 



(38) 



Similarly the hyperparameters for the posterior inverse Gamma 
distributions on cr? and a^,, are 



a„ 



N 






2 

|y ■ 



H(z)x 



(39) 



(40) 



Thus, generating realizations of a1 or a^^ using such a prior 
is as simple as taking the inverse of realizations of a gamma 
distribution with the proper choice of hyperparameters. For 
those who prefer a non-informative prior, the Jeffreys priors 
for al, 0-2, and al are p{al) = l/a% p{gI) = l/ul, and 
p((tJ) ~ l/cj- Although these priors are improper distri- 
butions, they are equivalent to inverse Gamma distributions 
with a — P — Q, so the associated posterior distributions 



represents the "burn-in time," the number of iterations until 
the Markov chain has approximately reached its steady state, 
and / represents the number of samples to generate after 
convergence, which are averaged to form the MMSE estimates. 



V. Simulation Results 

In this section, both the convergence behavior and the 
performance of the Gibbs/slice sampler are analyzed. Using 
Matlab, a i^-parameter signal and N = K M samples of that 
signal are generated with pseudo-random jitter and additive 
noise; M is the oversampling factor Then, implementations of 
the Gibbs/slice sampler, as well as the linear MMSE estimator 
in (I20I 1, the no-jitter linear estimator in ( |22] |, and the EM 
algorithm developed in |4| for approximating the ML estimator 
are applied to the samples. The adaptation of the EM algorithm 
to random ct^ and cr^ is described in the appendix; however, 
the EM algorithm with known cr^„ and cr^ is used in these 
simulations because adapting to random variances dramatically 
increases the computational cost, and the difference in MSE is 
negligible. These algorithms are studied in detail for periodic 
bandlimited signals with uniformly distributed signal parame- 
ters in [17], and in this work, a similar analysis is performed 
to analyze the convergence and sensitivity to initial conditions 
of the proposed algorithms. This analysis is also similar to 
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that performed in |j4l for the EM algorithm approximation to 
the ML estimator of the non-Bayesian version of this paper's 
problem formulation. 

A. Convergence Analysis 

As a Markov chain Monte Carlo method, the Gibbs/slice 
sampler converges to the appropriate posterior distribution 
under certain conditions (see |22|); as long as the sequence 
generated by sampling from the steady-state distribution 
p(x, z, cr^, cr^, <T^ I y) is ergodic, the samples can be averaged 
to approximate the Bayes MMSE estimate of the signal 
parameters. In addition, the steady-state distribution of an 
irreducible chain is unique, so the choice of initialization 
should not impact the final estimate generated from the steady- 
state samples. Of course, since the chain only converges to the 
steady-state in the limit, small transient effects from the initial 
conditions are evaluated. 

The rate of convergence of the Gibbs/slice sampler, as 
measured by the ||V||2 and the square root of the PSRF, 
is shown in Figure |7] The results suggest that increasing the 
oversampling factor A/ or the jitter variance a1 or decreasing 
the additive noise variance a^ slows the rate of convergence. 
In most cases, the Markov chain appears to reach a steady 
state within 500 iterations; thus, we set /;, = 500 iterations 
(see Algorithm |2]i for the tests that follow. 

To establish the number of iterations / needed after burn- 
in, we observe the squared error ||x/ — x*||2, where x/ is 
the /th estimate of x, as a function of /, for / up to 1000, 
and X* is the true value of x. Examining the plots in Figure |8] 
approximately 500 iterations are sufficient to achieve a squared 
error within 0.5 dB of the asymptotic MSE (as measured by 
/ = 1000) for all cases. 

The sensitivity to initial conditions of the Gibbs/slice sam- 
pler is shown in Figure |9] for I^ — I — 500. For 50 trials, the 
squared error of the Bayes MMSE estimates are measured for 
ten different choices of initial conditions. The ten choices of 



initial conditions used are (1) ct; 



(0) 



1, ai°) 



a^' 



0.1, and 
aU x(°) and z(o) equal to zero, (2) a''°'^ = 1, af^ = ai"^ = 0.1, 
z(°) equal to zero, and the no-jitter LMMSE estimate for 
x'^"\ (3) the true values of cr^, a^, a^, z, and x, and (4- 
10) seven choices of random values of ct^, a'^, a^, z and 
the corresponding fixed-jitter LMMSE estimates for x. The 
squared errors displayed are normalized so that the squared 
error for the no-jitter LMMSE estimate starting point equals 
one. Although the Gibbs/slice sampler becomes more sensitive 
to initial conditions as a^ increases, in all cases, the squared 
errors for the majority of initial conditions are close to one. 
Thus, even though the algorithms are still sensitive to initial 
conditions after the burn-in period, especially for larger jitter 
variance, the choice of no-jitter LMMSE estimate is about 
average. 

B. Performance Comparisons 

In Figure [TOl the performance of the Gibbs/slice sampler is 
compared against the linear MMSE and no-jitter linear MMSE 
estimators and the EM algorithm approximation to the ML 
estimator derived in ||4l. The MSE performances are plotted for 
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(c) K = 10, M = 4, E [a^] = 0.25^, E [a^] varies. 

Fig. 7. The convergence of the Gibbs/shce sampler (100 chains, 1500 
samples) as a function of the number of samples Ii, + I is measured by 
the PSRF-'^" and ||V||2 convergence metrics. The ||V||2 values are 
normalized by the final value for each curve. The parameters a^, Px, ctz, 
/3z, Ou), and /3u, are determined using (To) and jilt . The rate of convergence 
depends on the choice of parameters, as demonstrated in the above plots. 



different values of M, az, and a^ to demonstrate the effect of 
increasing il/, increasing a^, or decreasing ct^, on the relative 
MSE performances. Comparing the Gibbs/slice sampler Bayes 
MMSE estimate against the linear estimator, the Gibbs/slice 
sampler outperforms the linear MMSE estimator for a large 
range of Cz, a difference that becomes more pronounced 
with higher oversampling M. In addition, the results suggest 
that the Gibbs/slice sampler outperforms classical estimation, 
especially for higher jitter variances. 

We also compare computation times for the EM algorithm 
and the Gibbs/slice sampler. Both converge more slowly for 
higher jitter and lower additive noise, and greater oversampling 
also lengthens computation. In the case of K = 10, M — 16, 
E[cr2] = 0.52, and E[cr2 ] = 0.025^, the EM algorithm with 
known crl and a"^ requires 1.6 seconds per trial on average, the 
EM algorithm for random noise variances requires 24 seconds, 
and the Gibbs/slice sampler requires 3.1 seconds on average. 
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(c) K = 10, M = 4, E [cr2] = 0.25^, E [a^] varies. 

Fig. 8. The convergence of the estimator for x from the Gibbs/slice sampler 
(/(, = 500, 1 < / < 1000 samples) is measured from 1000 trials by the 
MSB of the Gibbs sampler estimate of x; the MSB is normaUzed so the MSB 
for I = 1000 samples is dB. The parameters a^, fix, ctz, /3z, a^, and /3^ 
are determined using (To) and (TT). The rate of convergence (when the error 
line stabilizes) depends on the choice of parameters, as demonstrated in the 
above plots. 



In only an eighth the time, the Gibbs/slice sampler achieves 
greater MSE performance than the EM algorithm. 

To understand the effectiveness of these methods in mitigat- 
ing jitter, the difference in jitter variance as a function of target 
MSE is computed based on the performance results and the 
maximum observed differences (for E[cr2]^/^ > ^E[o'^]^/^, to 
avoid the region where the MSE plots are flat) are compared 
for different values of M and £[0-^]^/^. The resulting trends 
portrayed in Figure[TT]demonstrate that greater improvement is 
achievable with increased oversampling M, and small additive 
noise variance E[(t^]. In addition, the Gibbs/slice sampler 
outperforms the classical ML estimator (as approximated by 
the EM algorithm in ||4l) at high jitter, increasing the factor 
of improvement, especially in the high oversampling and low 
additive noise variance regimes. 

VI. Conclusion 

The results displayed in this paper suggest that post- 
processing jittered samples with a nonlinear algorithm like 
Gibbs/slice sampling mitigates the effect of sampling jitter 
on the total sampling error In particular, the expected jitter 
standard deviation can be increased by as much as a factor of 
2.2, enabling substantial power savings in the analog circuitry 
when compared against linear post-processing or classical 
nonlinear post-processing (the EM algorithm). Such power 
savings may enable significant improvements in battery life 



m 5 



-10 



-15 




(a) K = 10, E [al] 



8 16 

oversampling factor (M) 

= 0.252, E \al] = 0.252, M varies. 



-10 



-15 



0.01 



0.1 



0.25 



average jitter std. dev. (E{j ] ) 



.2l1/2i 



(b) K = 10, A/ = 8, E [al,] = 0.25^, E [a^] varies. 




0.1 



0.25 



0.5 



average AWGN std. dev. {E^^f^) 



(c) K = 10, Af = 8, E [0-2] = 0.252, E [crl] varies. 

Fig. 9. The effects of varying initial conditions of the Gibbs/slice sampler 
as a function of oversampling factor |(a)| jitter variance |(b)| and additive 
noise variance |(c)| are studied by computing the squared errors of the results, 
for multiple initial conditions, across 50 trials. The squared en'ors of the 
results are normalized relative to the result for initialization with the zero- 
jitter LMMSE in (22), so that the squared error of the result for initialization 
with this linear estimator is dB. The parameters a^, fix, ciz, fiz, ctw, and 
Pm are determined using UO) and (TT). 



for implantable cardiac pacemakers and enable the inclusion 
of ADCs in ultra-low power devices. 

Like the EM algorithm proposed in |4|, the Gibbs/slice 
sampler proposed here suffers from relatively high compu- 
tational complexity and an iterative nature, which may be 
unsuitable for embedded applications. Developments in poly- 
nomial estimators, such as the Volterra filter-like polynomial 
estimators described in f29\, may yield similar performance to 
the Gibbs/slice sampler proposed here, at least for low levels 
of oversampling, without such high online computational cost. 
Further investigation is warranted in developing these and sim- 
ilar approaches for post-processing jittered samples in ADCs. 
Nevertheless, for off-chip post-processing of jittered samples, 
the nonlinear Bayesian Gibbs/slice sampler presented here 
outperforms both linear MMSE estimator and the nonlinear 
classical EM algorithm approximation to the ML estimator. 
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Fig. 10. The MSB performance of the Bayes MMSE estimator as computed 
using the Gibbs/slice sampler is compared against both the unbiased linear 
MMSE estimator J20) and the no-jitter linear MMSE estimator )22K as well 
as the EM algorithm approximation to the ML estimator from |4|. The values 
of ax, Px, ctz, /3z, ctw, and /3„, ai'e determined for the average cr^, cr^, 
and (tJ using (To) and jilt . The EM algorithm uses the true values of cr^, 
cr^ , and a^ , while the linear estimators and the Gibbs/slice sampler treat a^ , 
(T^, and (T^ as random variables. The error bars above and below each data 
point for the estimators delineate the 95% confidence intervals for those data 
points. 



Appendix 
ML Estimation with Random Variances 



In f4l, the EM algorithm approximation to the ML estimator 
is derived in the classical setting for known variances n^ and 
cr^. To adapt the method for random variances, we introduce 
a'i and ct^, as latent variables: 
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Fig. 1 1 . Jitter improvement from using MMSE (Gibbs/slice sampler) and 
ML estimators (EM algorithm with known a^, Tz, and cr^) is measured by 
interpolating the maximum factor of improvement in jitter tolerance, measured 
by E [fT^] > relative to using no-jitter LMMSE reconstruction. Holding 
E [iTu,] fixed, [(a)] shows the trend in maximum improvement as M increases, 

and |(b)| shows the trend in maximum improvement as E [o"^] increases 
while holding M fixed. The jitter standard deviation cr* corresponding to this 
maximum improvement for the MMSE and ML estimators is plotted on the 
same axes. 



By conditional independence, 

p(y,z,CTf,CT^;x) =p(y I z,cr2;x)p(z I (Tl)p{(jl)p{al,) 

= AA(y;H(z)x,a2l)p(z | al)p{al)p{al). 

(42) 

The terms not involving x are unnecessary, since we are 
differentiating with respect to x in the next step. The derivative 
of the expectation in dTTT l is 

2H^(z)(H(z)x-y) 



2al 



y;x(-l) 



(43) 



Setting the derivative equal to zero yields a linear system in 
x: 
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-iT 



y;x 



[t-i] 



As is done in ||4l, the expectations in (l44l l become: 
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(45) 



(46) 
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The hybrid quadrature method discussed in Section |ll] can 
be used to compute the expectations in ( |45l l and ( |46] i: 



E 



h„(z„)h^(z„) 



Vn 
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Jl J2 J3 

EEE 



[10] A. Nordio. C.-F. Chiasserini, and E. Viterbo, "Signal reconstruction 
errors in jittered sampling," IEEE Trans. Signal Process., vol. 57, no. 12, 
pp. 4711^718, Dec. 2009. 
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EEE^ 



jl=iJ2 = l J3-- 



)G. CoV, I\M'Ha?ri^;-iMrD. A. HuiWf^Rriys, "An algorithm for the 

removal of noise and jitter in signals and its application to picosecond 

electrical measurement," Num. Alg., vol. 5, no. 10, pp. 491-508, Oct. 

rj. 1993. 

^J2^J3'^n\?liW Ty?- Tuncgr,, "BloQfej3a.iad n^thdds for the reconstruction of finite- 

iT^ niii I Y(i— Ij'i'lengtn^jgnalsVftiMTOnunifDimjsniples," /iiiiiJ 



(48) 



Hybrid quadrature is also used to compute p{yn \ x'^'"^-') 
(see (fTsTi). Then, the EM algorithm becomes iteratively solv- 
ing (l44l l for x*^'^, using the above hybrid quadrature formulas. 
However, due to the three-dimensional nature of the hybrid 
quadrature formulas, computational cost can increase dramat- 
ically. 

Due to the increased computational cost of adapting the 
EM algorithm to random variances, we compare the MSE 
performance of both EM algorithms for the same choices of 
parameters used in the performance plots in ||4l (1000 trials, 
Jl = J2 = 9, J3 = 129). The MSE performance for both 
algorithms are almost identical, up to only 0.54 dB apart. Thus, 
to reduce computation time when comparing performance 
against the Gibbs/slice sampler, the EM algorithm with known 
variances is used as a proxy for the EM algorithm with random 
variances. 
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